[data][llm] fix vllm ray data quickstart example #58463
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Root cause: Quickstart guide merged last night
Code Review
This pull request reduces the max_model_len in the vLLM Ray Data quickstart example to prevent out-of-memory errors on GPUs with less memory. This is a sensible change that makes the example more accessible and robust for users with varying hardware configurations. The change is correct and I have no further comments.
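For illustration, a minimal sketch of capping the context length through the engine kwargs. The helper name `build_engine_kwargs` and the value 4096 are hypothetical; the actual value chosen in this PR is not shown in the conversation. The commented-out usage assumes Ray and vLLM are installed and that the quickstart passes `engine_kwargs` to `vLLMEngineProcessorConfig`.

```python
from typing import Any, Dict


def build_engine_kwargs(max_model_len: int) -> Dict[str, Any]:
    """Return vLLM engine kwargs that cap the maximum context length.

    Hypothetical helper for illustration; it is not part of Ray or vLLM.
    """
    if max_model_len <= 0:
        raise ValueError("max_model_len must be positive")
    return {"max_model_len": max_model_len}


# Hypothetical value: a smaller cap lowers the KV cache memory the engine
# must be able to reserve for a single full-length sequence.
engine_kwargs = build_engine_kwargs(4096)

# With Ray and vLLM installed, the dict would typically be passed along as
# (assumption about how the quickstart wires it up):
#   from ray.data.llm import vLLMEngineProcessorConfig
#   config = vLLMEngineProcessorConfig(
#       model_source="<model id>",
#       engine_kwargs=engine_kwargs,
#   )
print(engine_kwargs)
```

If the engine still fails to start on a small GPU, lowering the cap further is usually the first thing to try, at the cost of rejecting longer prompts.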
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: YK <1811651+ykdojo@users.noreply.github.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>
Fix OOM / GPU memory constraint by setting max_model_len in the quickstart